Social  Network  Change  Detection 


Ian  A.  McCulloh  and  Kathleen  M.  Carley 

March  17,  2008 
CMU-ISR-08-116 


Institute  for  Software  Research 
Carnegie  Mellon  University 
School  of  Computer  Science 
Pittsburgh, PA 15213


Center  for  the  Computational  Analysis  of  Social  and  Organizational  Systems 
CASOS technical report.


This  is  a  project  of  the  Center  for  Computational  Analysis  of  Social  and  Organizational  Systems 
(CASOS).  This  work  was  supported  in  part  by  the  Army  Research  Labs  Grant  No.  DAAD19-01-2-0009, 
the Office of Naval Research (ONR), United States Navy Grant No. N00014-02-10973 on Dynamic
Network Analysis, the Air Force Office of Sponsored Research (MURI: Cultural Modeling of the
Adversary Organization, 600322), and the NSF IGERT program in CASOS (DGE-9972762).


Report Documentation Page (Standard Form 298, Rev. 8-98; OMB No. 0704-0188)

Report date: 17 MAR 2008
Dates covered: 00-00-2008 to 00-00-2008
Title: Social Network Change Detection
Performing organization: Carnegie Mellon University, School of Computer Science, Institute for Software Research, Pittsburgh, PA 15213
Distribution/availability statement: Approved for public release; distribution unlimited
Security classification (report, abstract, this page): unclassified
Limitation of abstract: Same as Report (SAR)
Number of pages: 26

Keywords:  Social  Networks,  Change  Detection,  Statistical  Process  Control,  CUSUM, 
Al-Qaeda,  IkeNet,  Terrorism 


Abstract 


Changes  in  observed  social  networks  may  signal  an  underlying  change  within  an 
organization,  and  may  even  predict  significant  events  or  behaviors.  The  breakdown  of  a 
team’s  effectiveness,  the  emergence  of  informal  leaders,  or  the  preparation  of  an  attack 
by  a  clandestine  network  may  all  be  associated  with  changes  in  the  patterns  of 
interactions  between  group  members.  The  ability  to  systematically,  statistically, 
effectively  and  efficiently  detect  these  changes  has  the  potential  to  enable  the  anticipation 
of  change,  provide  early  warning  of  change,  and  enable  faster  response  to  change.  By 
applying  statistical  process  control  techniques  to  social  networks  we  can  detect  changes 
in these networks. Herein we describe this methodology and then illustrate it using two
data sets. The first deals with the email communications among graduate students. The
second  is  the  perceived  connections  among  members  of  al  Qaeda  based  on  open  source 
data.  The  results  indicate  that  this  approach  is  able  to  detect  change  even  with  the  high 
levels  of  uncertainty  inherent  in  these  data. 


1. INTRODUCTION


Organizations  are  not  static,  and  over  time  their  structure,  composition,  and 
patterns  of  communication  may  change.  These  changes  may  occur  quickly,  such  as  when 
a  corporation  restructures,  but  they  often  happen  gradually,  as  the  organization  responds 
to  environmental  pressures,  or  individual  roles  expand  or  contract.  Often,  these  gradual 
changes  reflect  a  fundamental  qualitative  shift  in  an  organization,  and  may  precede  other 
indicators  of  change.  It  is  important  to  note,  however,  that  a  certain  degree  of  change  is 
expected  in  the  normal  course  of  an  unchanging  organization,  reflecting  normal  day-to- 
day  variability.  The  challenge  of  Social  Network  Change  Detection  is  whether  metrics 
can  be  developed  to  detect  signals  of  meaningful  change  in  social  networks  in  a 
background  of  normal  variability. 

Organizations  can  be  represented  with  many  different  networks.  Relationships 
between  people  form  social  networks.  Relationships  between  people  and  their 
knowledge,  resources,  tasks,  beliefs,  and  other  dimensions  all  form  networks  as  well. 
The  collection  of  these  networks  is  referred  to  as  a  meta-network  (Krackhardt  and  Carley, 
1998).  One  advantage  in  representing  organizations  using  meta-networks  is  the  ability  to 
mathematically  quantify  and  represent  complex  interrelated  organizational  behavior.  In 
addition,  network  representations  of  organizations  can  have  a  visual  appeal  that  enhances 
insight  and  understanding  of  organizational  dynamics.  If  we  accept  the  notion  that 
organizations  consist  of  a  meta-network  of  relationships,  the  data  collected  on  the 
organization  over  time  can  be  used  to  construct  observed  instances  of  the  network.  Due 
to  normal  fluctuations  in  behavior  and  data  collection  errors,  it  is  conceivable  that  an 
observed  network  might  differ  slightly  from  the  actual  underlying  network  of 
organizational  relations.  How  then,  can  we  detect  statistically  meaningful  changes  in  the 
organization,  within  this  meta-network  representation?  This  paper  proposes  an  approach 
that  is  focused  on  social  networks,  but  could  be  expanded  to  include  other  network 
dimensions  in  the  future. 

Social  Network  Analysis  (SNA)  is  an  approach  to  studying  and  analyzing  groups 
of  actors  and  their  ties.  When  applied  to  communication  networks,  SNA  enables  us  to 
quantitatively  analyze  the  patterns  of  information  flow  through  time  and  space  (Monge  & 
Contractor,  2003).  These  techniques  can  be  used  to  characterize  the  roles  of  individuals 
in  groups,  compare  subgroups  with  one  another,  or  describe  the  informal  structure  of 
large  organizations  (Wasserman  &  Faust,  1994). 

There has been a recent increase in temporal social network data (McCulloh
et al., 2007). Unobtrusive tools now exist to extract network data from e-mail servers,
from news media, and from written documents within an organization. This allows an analyst
to  construct  multiple  network  observations  of  an  organization,  whether  it  is  daily,  weekly, 
yearly,  or  any  other  temporal  breakdown.  With  the  increased  emergence  of  observed 
instances  of  social  networks  over  time,  improved  methods  of  detecting  meaningful 
change  are  needed.  Simply  looking  for  obvious  drastic  changes  may  be  insufficient  for 
many  applications. 


2. BACKGROUND


Current  methods  of  change  detection  in  social  networks  are  limited.  Hamming 
distance  (Hamming,  1950)  is  often  used  in  binary  networks  to  measure  the  distance 
between  two  networks.  Euclidean  distance  is  similarly  used  for  weighted  networks 
(Wasserman  and  Faust,  1994).  While  these  methods  may  be  effective  at  quantifying  a 
difference  in  static  networks,  they  lack  an  underlying  statistical  distribution.  This 
prevents  an  analyst  from  identifying  a  statistically  significant  change,  as  opposed  to 
normal  and  spurious  fluctuations  in  the  network.  Social  Network  Change  Detection 
significantly  improves  on  previous  attempts  to  detect  organizational  change  over  time  by 
introducing  a  statistically  sound  probability  space  and  uniformly  more  powerful  detection 
methods. 
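
To make the two distance measures concrete, the following is a minimal sketch, assuming each network snapshot is stored as a square adjacency matrix (a list of lists). The function names and the two example snapshots are illustrative and do not come from the data sets used in this report. The sketch returns a single number per pair of snapshots, which illustrates the limitation noted above: nothing in the number itself says whether it is large relative to normal fluctuation.

```python
import math


def hamming_distance(a, b):
    """Count the cells in which two binary adjacency matrices differ."""
    return sum(
        1
        for row_a, row_b in zip(a, b)
        for x, y in zip(row_a, row_b)
        if x != y
    )


def euclidean_distance(a, b):
    """Square root of the summed squared differences between edge weights."""
    return math.sqrt(
        sum(
            (x - y) ** 2
            for row_a, row_b in zip(a, b)
            for x, y in zip(row_a, row_b)
        )
    )


# Two hypothetical snapshots of a 3-node directed network (made-up values).
week_1 = [[0, 1, 0],
          [1, 0, 1],
          [0, 0, 0]]
week_2 = [[0, 1, 1],
          [1, 0, 0],
          [0, 0, 0]]

print(hamming_distance(week_1, week_2))    # two cells differ
print(euclidean_distance(week_1, week_2))  # square root of 2 for these binary matrices
```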

Several  methods  for  studying  social  networks  over  time  have  been  proposed  in  the 
literature.  Exponential  Random  Graph  Models  (ERGM)  include  structural  variables  to 
predict future graph evolution (Handcock and Morris, 2005; Goodreau, 2007; Robins et
al., 2007). The software package SIENA is often used to study longitudinal data
(Snijders et al., 2007). The Network Probability Matrix (NPM) approach makes
different assumptions than the ERGM and uses historic relationships to predict future
networks (McCulloh et al., 2007). Conceptual models such as preferential attachment
and fitness models have been used to predict the future behavior of network evolution
through time. While it may yet be unclear which method more closely resembles the true
evolution of networks, all methods provide an analyst with a means to understand a
possible underlying statistical distribution for social network measures. Statistical
distributions have been fit to several data sets, using the NPM and empirical approaches
(McCulloh et al., 2007; Bailer et al., 2008). Findings indicate that measures of average
centrality,  average  betweenness,  and  density  are  all  normally  distributed  for  networks  of 
greater  than  30  nodes.  These  findings  suggest  that  the  necessary  assumptions  for  many 
statistical  process  control  charts  may  be  satisfied  for  these  three  measures. 

Social  Network  Change  Detection  is  a  process  of  monitoring  networks  to 
determine  when  significant  changes  to  their  organizational  structure  occur  and  what 
caused  them.  We  propose  that  techniques  from  SNA,  combined  with  those  from 
statistical  process  control  can  be  used  to  detect  when  significant  changes  occur  in  a 
network.  In  application,  it  requires  the  use  of  statistical  process  control  charts  to  detect 
changes  in  observable  network  measures.  By  taking  measures  of  a  network  over  time,  a 
control  chart  can  be  used  to  signal  when  significant  changes  occur  in  the  network.  We 
describe our technique below, first providing an overview of the relevant SNA and
statistical process control approaches, then describing the impact of applying them to
relational data and which social network measures are suitable for monitoring. We
follow that with demonstrations of the technique on two distinct network data sets: the
emails between Army officers in a graduate program and the patterns of communication
between members of Al-Qaeda.

Social  Network  Analysis 

SNA provides the basis for how networks are modeled, measured, and compared.
A typical social network can be modeled as a graph with people represented as vertices
and links between them as edges (Scott, 2002; Wasserman and Faust, 1994). These
edges  can  represent  a  wide  variety  of  links  including  exchanged  emails,  shared  religious 
beliefs,  or  attendance  at  the  same  university.  Edges  may  be  weighted  to  show  the 
importance  of  the  link.  For  example,  the  weight  could  be  how  many  emails  were  sent 
over  the  data  collection  time  period.  Edges  may  also  be  directed  to  show  who  is 
initiating the link and who is receiving it. The simplest social networks have just one edge
set  that  is  un-weighted  and  undirected. 

There  are  many  network  measures  that  can  be  calculated  from  a  given  graph. 
Network  measures  can  be  calculated  from  the  entire  graph  or  for  each  individual  node. 
Centrality  network  measures  such  as  betweenness  and  closeness  are  widely  used  for  their 
easily  applied  practical  applications  in  determining  how  information  spreads  through  a 
social network. For illustration this paper will use one graph level measure, density
(Coleman and More, 1983); and two individual node measures averaged over the graph,
closeness (Freeman, 1979) and betweenness (Freeman, 1977). These are chosen because
they  are  commonly  used  in  the  literature  and  represent  a  range  of  the  types  of  measures 
available  for  change  detection. 
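
As an illustration of the three measures, here is a minimal sketch, assuming an undirected, unweighted, connected graph stored as a dictionary that maps each node to the set of its neighbours. The function names, the normalizations (closeness as (n - 1) divided by the sum of distances, betweenness divided by the number of node pairs), and the three-node example are assumptions made for this sketch; the exact formulas used by ORA may differ.

```python
from collections import deque


def density(adj):
    """Fraction of possible undirected edges that are present."""
    n = len(adj)
    edges = sum(len(nbrs) for nbrs in adj.values()) / 2  # each edge is counted twice
    return 2 * edges / (n * (n - 1))


def average_closeness(adj):
    """Mean over nodes of (n - 1) / (sum of shortest-path distances)."""
    n = len(adj)
    total = 0.0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from source
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total += (n - 1) / sum(dist.values())
    return total / n


def average_betweenness(adj):
    """Mean normalized betweenness, accumulated with Brandes' algorithm."""
    n = len(adj)
    score = {v: 0.0 for v in adj}
    for s in adj:
        sigma = {v: 0 for v in adj}  # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):  # back-propagate pair dependencies
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was visited from both ends, so dividing by
    # (n - 1)(n - 2) normalizes by the (n - 1)(n - 2) / 2 possible pairs.
    return sum(score.values()) / ((n - 1) * (n - 2)) / n


# Hypothetical 3-node path a - b - c.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
print(density(path), average_closeness(path), average_betweenness(path))
```

Computing one such triple per time period yields the series of graph-level values that a control chart can monitor.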

Despite  the  practicality  of  these  measures,  several  problems  arise  from  their 
usage.  First,  these  individual  measures  must  be  translated  into  a  network  picture  of  the 
entire  graph.  This  may  be  as  simple  as  averaging  the  measures  across  the  entire  graph 
and  using  that  as  the  measure  for  each  time  period.  An  alternative  method  would  be  to 
use  either  the  maximum  or  minimum  value  from  nodes  within  the  graph  as  the  sample. 
Unlike in Everett and Borgatti’s paper (1999), one cannot recalculate the network measure
by  collapsing  the  graph  into  a  single  node  and  analyzing  its  links  with  nodes  outside  the 
group  because  our  group  involves  the  entire  graph  and  the  result  would  be  trivial.  One 
must  thus  explore  how  both  the  individual  measures  and  average  measures  are  distributed 
and whether the average is a good representation for the entire graph. A second difficulty
with these measures is their normalization. In order to compare measures across different
time periods, they must be normalized. For a steady sized group this should not be an
issue,  but  in  the  case  of  an  expanding  or  contracting  group,  issues  arise  as  to  whether 
results  can  be  used  across  the  different  scales  of  group  size.  In  other  words,  the  network 
measures  may  change  in  different  ways  with  respect  to  the  current  group  size  and  thus 
provide inconsistent information about the group even in the absence of any changes within the
group.  For  this  research,  the  Organizational  Risk  Analyzer  (ORA)  developed  by 
Kathleen  Carley  at  the  Center  for  Computational  Analysis  of  Social  and  Organizational 
Systems  at  Carnegie  Mellon  University  is  used  to  compute  the  average  network  measures 
from  all  group  information  (Carley,  2007). 

Statistical  Process  Control 

The  second  component  for  social  network  change  detection  is  Statistical  Process 
Control  (SPC).  SPC  is  a  technique  used  by  quality  engineers  to  monitor  industrial 
processes.  They  use  control  charts  to  detect  changes  in  the  mean  of  the  industrial  process 
by taking periodic samples of the product and tracking the results against a control limit.
Once  a  change  has  been  detected,  the  engineers  determine  the  most  likely  time  the 
change  occurred  to  reexamine  and  reset  the  process  to  avoid  financial  loss  for  the 
company  by  making  substandard  or  wasteful  product.  Control  charts  are  usually 
optimized  for  their  processes  to  increase  their  sensitivity  for  detecting  changes,  while 
minimizing  the  number  of  false  alarms  -  signals  when  no  change  has  actually  occurred  in 
the  process. 

The  control  chart  investigated  for  this  project  was  the  cumulative  sum  (CUSUM). 
The  CUSUM  control  chart  is  a  widely  used  control  chart  derived  from  the  sequential 
probability  ratio  test  (SPRT)  (Page,  1961).  The  SPRT  was  derived  in  turn  from  the 
Neyman  and  Pearson  (1933)  most  powerful  test  for  a  simple  hypothesis. 

The decision rule of the CUSUM chart runs off the cumulative statistic

$$C_t = \sum_{j=1}^{t} (Z_j - k)$$

where $Z_j$ is the standardized normal of each observation,

$$Z_j = \frac{\bar{x}_j - \mu_0}{\sigma_{\bar{x}}}$$

and the common choice for $k$ is 0.5 (McCulloh, 2004), which corresponds to a
standardized magnitude of change of 1. The CUSUM control chart sequentially
compares the statistic $C_t$ against a control limit $a'$ until $C_t > a'$. Since we are not
interested in concluding that the network is unchanged, the cumulative statistic is

$$C_t^+ = \max\{0,\ Z_t - k + C_{t-1}^+\}$$

The statistic $C_t^+$ is compared to the constant control limit, $h^+$. If $C_t^+ > h^+$, then the
control chart signals that an increase in a network measure has occurred. Since this rule
only detects increases in the mean, a second cumulative statistic rule must be used to
detect decreases in the mean,

$$C_t^- = \max\{0,\ -Z_t - k + C_{t-1}^-\}$$

which signals a decrease in a network measure’s mean when $C_t^- > h^-$.
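
The two one-sided recursions can be sketched in a few lines of Python. This is a minimal illustration, assuming the observations have already been standardized; the function name `cusum` and the example values are hypothetical, and only the default k = 0.5 comes from the text.

```python
def cusum(z_scores, k=0.5):
    """Return the upper and lower CUSUM sequences for standardized observations."""
    c_plus, c_minus = [], []
    up = down = 0.0
    for z in z_scores:
        up = max(0.0, z - k + up)        # accumulates evidence of an increase
        down = max(0.0, -z - k + down)   # accumulates evidence of a decrease
        c_plus.append(up)
        c_minus.append(down)
    return c_plus, c_minus


# Made-up standardized values: two high observations, then one low one.
c_plus, c_minus = cusum([0.0, 1.5, 1.5, -2.0])
print(c_plus)   # [0.0, 1.0, 2.0, 0.0]
print(c_minus)  # [0.0, 0.0, 0.0, 1.5]
```

Each statistic resets to zero whenever the accumulated evidence turns negative, which is why a single low observation clears the upper statistic in this example.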

The  CUSUM  control  chart  was  selected  for  two  reasons.  First,  this  chart  is  well 
suited  to  detecting  small  changes  in  the  mean  of  a  process  over  time.  In  terms  of  a  social 
network,  this  is  a  desired  quality  because  one  would  not  expect  a  social  network  to 
change  dramatically  between  short  time  periods.  By  casual  observation,  one  could 
conclude  that  a  person’s  friends  generally  stay  the  same  from  week  to  week  and  not 
expect  drastic  changes  in  that  social  network.  In  addition,  drastic  changes  in  the  network 
are  normally  quite  obvious,  but  since  the  CUSUM  is  good  at  detecting  slight  changes  it 
may  be  able  to  provide  early  warning  for  drastic  changes,  or  reveal  when  more  subtle 
changes  have  occurred.  A  second  benefit  of  the  CUSUM  control  chart  is  its  built-in 
change  point  detection.  After  the  control  chart  signals,  the  most  likely  change  point  is 
found  by  tracing  the  C  statistic  back  to  the  last  time  it  was  zero.  This  allows  the  time  of 
the  change  in  the  network  to  be  calculated  quickly  and  easily. 


3. METHOD


Social  network  change  detection  algorithms  are  implemented  in  much  the  same 
way  a  control  chart  is  implemented  in  a  manufacturing  process.  The  average  graph 
measures  for  density,  closeness,  and  betweenness  centrality  are  calculated  for  several 
consecutive  time-periods  of  the  social  network.  When  these  measures  appear  to  have 
stabilized  over  time,  the  “in-control”  mean  and  variance  for  the  measures  of  the  network 
are  calculated  by  taking  a  sample  average  and  sample  variance  of  the  stabilized  measures. 
The subsequent, successive social network measures are then used to calculate the
CUSUM’s $C^+$ and $C^-$ statistics. These are then compared to a control limit to determine
when or if the control chart signals a change in the mean of the monitored network
measure. Upon receiving a signal, the change point is calculated by tracing the signaling
$C^+$ or $C^-$ statistic back to the last time period it was zero. In order to continue running the
control  chart  after  a  signal,  the  in-control  mean  and  variance  are  recalculated  after  the 
network  measures  have  stabilized  following  the  change. 
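
The procedure in this section can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `detect_change`, the control limit h = 4.0, the choice of the first `n_baseline` periods as the stabilized in-control window, and the example density series are all hypothetical.

```python
from statistics import mean, stdev


def detect_change(measures, n_baseline, k=0.5, h=4.0):
    """Monitor a series of network measures with a two-sided CUSUM.

    The first n_baseline values are treated as the in-control period.
    Returns (signal_index, change_point_index, direction), or None if the
    chart never signals. Indices are positions in `measures`.
    """
    baseline = measures[:n_baseline]
    mu, sigma = mean(baseline), stdev(baseline)  # in-control mean and std. dev.
    up = down = 0.0
    last_zero_up = last_zero_down = n_baseline - 1
    for t in range(n_baseline, len(measures)):
        z = (measures[t] - mu) / sigma
        up = max(0.0, z - k + up)
        down = max(0.0, -z - k + down)
        if up == 0.0:
            last_zero_up = t      # remember the last period the statistic was zero
        if down == 0.0:
            last_zero_down = t
        if up > h:
            return t, last_zero_up, "increase"
        if down > h:
            return t, last_zero_down, "decrease"
    return None


# Hypothetical density series: five in-control periods, then an upward drift.
series = [0.20, 0.22, 0.18, 0.21, 0.19, 0.20, 0.21, 0.24, 0.25, 0.26]
print(detect_change(series, n_baseline=5))  # (8, 5, 'increase')
```

In this made-up series the chart signals at index 8, and tracing the upper statistic back to its last zero points to index 5 as the most likely change point. To keep monitoring after a signal, the function would be called again with a baseline taken from the periods after the measures stabilize.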

The  suspected  time  periods  when  the  network  appears  to  be  significantly 
changing  can  be  estimated  using  the  CUSUM  statistic.  The  network  can  then  be  studied 
in  depth  across  these  time  periods  in  the  wide  variety  of  network  measures  to  determine 
the  extent  of  changes  to  the  network  structure.  Further  study  can  also  be  directed  towards 
determining  changes  in  the  environment  in  which  the  network  operates  during  those  time- 
periods. 


4.  DATA 

Two  data  sets  are  used  to  demonstrate  the  efficacy  of  the  social  network  change 
detection  approach.  The  first  data  set  is  email  traffic  from  a  group  of  24  Army  officers  in 
a  one  year  graduate  program  at  Columbia  University.  This  program  is  known  as  the 
Tactical  Officer  Education  Program.  The  second  data  set  is  an  open  source  Al-Qaeda 
social  network.  Details  of  these  data  sets  are  provided. 

Tactical Officer Education Program e-mail Network

The  Tactical  Officer  Education  Program  (TOEP)  is  a  one-year  graduate  program 
run  as  a  joint  effort  by  the  United  States  Military  Academy  (USMA)  and  Columbia 
University.  Each  year,  twenty-four  Army  officers  (referred  to  in  this  study  as  TOEPs  1 
through 24) enter the program to earn a Master’s degree in Social-Organizational
Psychology with a concentration in Leadership and to prepare for service as mentors for
West Point’s cadet companies during the following two years. Social network data on
email  communication  was  collected  for  24  weeks.  Details  regarding  the  data  collection 
and  network  properties  are  described  in  McCulloh,  et.  al.  (2007). 

The data were pre-processed before any social network change detection
algorithms  were  performed.  The  first  step  of  processing  the  raw  data  was  to  remove  all 
emails  sent  outside  of  the  TOEP  network.  The  primary  concern  of  the  study  was  to 
examine  how  email  communication  changed  within  the  exclusive  group  of  TOEP 
students.  This  required  that  records  of  emails  sent  to  non-TOEPs  and  email  addresses  of 
non-TOEPs  in  messages  that  were  sent  to  mixed  parties  were  deleted.  Thus,  all 
subsequent  network  pictures  would  only  involve  the  email  communication  among  the  24 
TOEPs.  Despite  our  best  efforts  though,  the  network  information  can  only  be  viewed  as 
“near”  complete  as  emails  sent  using  Webmail  are  not  collected  because  of  limitations  of 
the  data  collection  software  (McCulloh,  et.  al.  2007). 

The data were then separated into weekly time periods. Too much variance
existed  in  the  data  set  if  it  were  to  be  divided  into  monthly  time  periods  (McCulloh,  et.  al. 
2007).  This  variance  was  due  to  communication  patterns  that  changed  between  months 
of schoolwork (e.g., October and February) and those of long break periods (e.g.,
December  and  March).  These  large  changes  in  communication  patterns  would  prevent 
unbiased  calculation  of  the  baseline  measurements  with  which  to  calibrate  the  control 
chart.  Dividing  the  data  based  on  days  provided  too  much  resolution  and  was  also 
unacceptable  as  network  communication  patterns  change  dramatically  from  weekdays  to 
weekends. 
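
The two pre-processing steps described above, dropping emails that involve anyone outside the group and then splitting the remainder into weekly networks, might look like the following. This is a minimal sketch, assuming each email record is a (date, sender, recipient) tuple and that weeks follow the ISO calendar; the record layout, the function name `weekly_networks`, and the addresses are hypothetical and do not reflect the actual collection software.

```python
from collections import defaultdict
from datetime import date


def weekly_networks(emails, members):
    """Group within-group emails into weekly weighted edge lists.

    emails: iterable of (sent_on, sender, recipient) with sent_on a date
    members: set of addresses that belong to the monitored group
    Returns {(iso_year, iso_week): {(sender, recipient): email_count}}.
    """
    weeks = defaultdict(lambda: defaultdict(int))
    for sent_on, sender, recipient in emails:
        if sender not in members or recipient not in members:
            continue  # drop any email that touches a non-member
        iso = sent_on.isocalendar()
        weeks[(iso[0], iso[1])][(sender, recipient)] += 1
    return weeks


# Made-up records; "outsider" is not part of the monitored group.
members = {"toep1", "toep2", "toep3"}
emails = [
    (date(2007, 10, 29), "toep1", "toep2"),
    (date(2007, 11, 2), "toep1", "toep2"),
    (date(2007, 11, 2), "toep2", "outsider"),
    (date(2007, 11, 5), "toep3", "toep1"),
]
networks = weekly_networks(emails, members)
for week, edges in sorted(networks.items()):
    print(week, dict(edges))
```

Each weekly edge list is one observed instance of the network, from which the graph-level measures are then computed.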

The  network  measures  of  interest  were  selected  because  they  should  theoretically 
follow or approximate a normal distribution due to the central limit theorem. To confirm
this, the measures’ distributions were checked so that usage of the CUSUM Control
Chart could be justified. Each of the network measures was fit with five continuous
distributions:  normal,  uniform,  gamma,  exponential,  and  chi-squared.  Least  Squares  was 
used  to  determine  the  best  overall  distribution  for  each  measure.  The  distribution  with 
the  best  fit  for  betweenness  and  density  network  measures  was  the  Gamma  Distribution. 
This  invalidated  further  usage  of  the  CUSUM  Control  Chart  to  detect  changes  in  these 
network  measures  over  time. 

The observation that the average network measures followed a distribution other than
the normal distribution appeared to contradict the central limit theorem and warranted further
investigation.  Upon  deeper  exploration  of  the  data,  it  was  found  that  certain  subjects 
stopped  sending  email  at  some  point  in  the  study  and  did  not  send  email  again.  The 
principal  investigator  interviewed  these  subjects  and  found  that  they  had  experienced 
technical  problems  during  the  study  and  had  reformatted  their  hard  drive,  thereby  erasing 
the  collection  patch.  Other  subjects  began  to  rely  on  webmail,  which  bypassed  the 
collection  patch.  Therefore,  the  communication  data  collected  was  incomplete  and  not 
identically distributed. Subjects whose data collection was incomplete were eliminated
from  further  study.  Average  network  measures  calculated  on  the  reduced  data  set  did 
follow  a  normal  distribution.  A  communication  network  for  the  reduced  data  set  is 
shown  in  Figure  1  for  the  week  of  29  October  2007. 

Using this much smaller, but complete network, the three network measures of
interest  were  all  found  to  be  normally  distributed.  Determining  baseline  values,  however, 
was  still  not  possible  because  the  network  contained  too  much  variance.  There  was  no 
stable  network  measure  behavior.  In  order  to  account  for  the  variance  caused  by  differing 
schedules  week  to  week,  we  examined  a  copy  of  the  TOEP  planning  calendar  for  the 
entire  year.  The  calendar  combined  with  interviews  with  participants  allowed 
investigators  to  determine  the  number  of  significant  events  from  a  variety  of  categories 
that  occurred  each  week.  The  significant  events  based  on  qualitative  assessments  by  the 
participants  were  Academic  Requirements,  the  Next  Week’s  Academic  Requirements, 
Administrative  Events  (such  as  a  class  trip  or  cancelled  class),  Group  Projects,  Social 
Gatherings,  and  Days  Off. 

Using  MINITAB  Statistical  Software,  analysis  of  variance  (ANOVA)  tests  were  run  on 
predictors  to  determine  if  they  were  statistically  significant  factors  in  determining 
network measures. Days Off was the most significant factor, due to Christmas break in
the middle of the 24-week study; however, once these weeks were removed from the
study, Days Off was no longer a significant factor in any model. The best linear
regression  model  obtained  from  first  semester  (12  weeks)  data  for  closeness  based  on  the 
number  of  group  projects,  the  number  of  social  gatherings,  and  the  number  of  emails  sent 
each  week  found  in  Table  1  was, 

$$\text{Closeness} = 0.18 - 0.11\,(\text{Group Projects}) + 0.11\,(\text{Social Gatherings}) + 0.0074\,(\text{Number of Emails})$$


Table 1 ANOVA Table for Closeness Predictors

Predictor           Coefficient   SE Coefficient   T       P      VIF
Constant            0.18          0.034            5.4     0
Group Projects      -0.11         0.05             -2.1    0.05   1.3
Social Gatherings   0.11          0.04             2.89    0.01   1.3
Number of Emails    0.0074        0.00084          8.77    0      1


This model has an adjusted $R^2$ value of 79.8%, accounting for a large majority of
the variance in the network measure, and a predicted $R^2$ value of 70.9%. Slightly
surprising  from  this  model  is  the  effect  of  group  projects  on  closeness.  An  increase  in 
group  project  work  was  correlated  with  a  decrease  in  communication.  This  might  be  due 
to  the  fact  that  as  a  group  project  comes  due,  the  subjects  may  communicate  more  with 
their  immediate  team  of  group  members,  and  communicate  more  face-to-face,  but  overall 
they  decrease  communication  outside  of  their  working  groups  and  through  email  in  order 
to  focus  on  the  project.  The  positive  effects  of  Social  Gatherings  and  more  emails  sent 
over  the  week  had  the  foreseen  effect  of  improving  group  closeness. 

The  model  created  from  the  first  semester  was  used  to  predict  the  average 
closeness  value  for  the  second  semester.  The  CUSUM  control  chart  was  applied  to  the 
residual  error  between  the  prediction  and  the  actual  second  semester  data.  This  allowed 
the  investigators  to  conduct  real-time  monitoring  of  a  social  group  for  change. 
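
The residual calculation can be illustrated with the fitted coefficients from Table 1. This is a minimal sketch: the function name `predict_closeness` and the three example weeks, including their observed closeness values, are made up for illustration and are not the study's data.

```python
def predict_closeness(group_projects, social_gatherings, emails_sent):
    """First-semester regression model for average closeness (Table 1 coefficients)."""
    return (0.18
            - 0.11 * group_projects
            + 0.11 * social_gatherings
            + 0.0074 * emails_sent)


# Hypothetical second-semester weeks:
# (group projects, social gatherings, emails sent, observed closeness)
weeks = [
    (1, 0, 40, 0.35),
    (0, 1, 50, 0.70),
    (2, 1, 30, 0.20),
]

# Residual = observed value minus model prediction, one per week.
residuals = [obs - predict_closeness(g, s, e) for g, s, e, obs in weeks]
print([round(r, 3) for r in residuals])  # [-0.016, 0.04, -0.092]
```

Monitoring the residuals rather than the raw closeness values removes the week-to-week variation explained by the schedule, so that the control chart responds to changes the model does not anticipate. The residuals would be standardized before being passed to the CUSUM recursions.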

Al  Qaeda  Communications  Network 

The  Center  for  Computational  Analysis  of  Social  and  Organizational  Systems 
(CASOS)  at  Carnegie  Mellon  University  created  snapshots  of  the  annual  communication 
between  members  of  the  al  Qaeda  organization  from  its  founding  in  1988  until  2004  from 
open  source  data  (Carley,  2006).  The  data  is  limited  in  that  we  do  not  know  the  type, 
frequency,  or  substance  of  the  communication  and  all  links  are  non-directional,  meaning 
we  do  not  know  who  initiated  communication  with  whom.  Finally,  the  completeness  of 
the  data  is  uncertain  since  it  only  contains  information  available  from  open  sources.  The 
data  is  unique  in  that  it  provides  a  network  picture  of  a  robust  network  over  standard 
time-periods  of  one  year. 

Figure 2 Monitored al Qaeda Communication Network for Year 2001


Using  the  network  snapshots  for  each  year  time-period,  the  average  social 
network  measures  were  calculated  and  plotted  for  betweenness,  closeness,  and  density. 
Each  of  these  measures  increased  from  1988  until  1994,  and  then  leveled  off.  There  are 
many  possible  reasons  for  this  burn-in  period,  such  as  the  quality  of  our  intelligence 
gathering  on  al  Qaeda  and  the  rapid  development  and  reorganization  of  a  fast  growing 
organization.  In  al  Qaeda’s  early  years,  access  to  the  infant  organization  may  have  been 
limited,  as  well  as  the  resources  devoted  to  tracking  a  small,  new,  and  relatively 
unaccomplished  terrorist  network.  The  organization  itself  may  have  also  been  changing 
drastically  during  its  first  years  by  actively  recruiting  new  members,  and  shifting  its 
structure to accommodate new resources and infrastructure. For this reason, the mean
and standard deviation of each measure were calculated over the five years that follow
the burn-in period that ended in 1994. The CUSUM control chart was then used to
monitor the three measures above from 1994 to 2004. Figure 3 displays the plot of each
average social network measure in the al Qaeda network. The general trends for each of
these measures are the same throughout the entire time period.

Figure 3 Plot of Selected Network Measures of al Qaeda Organization

5.  RESULTS 

The  approach  proposed  in  this  paper  was  found  to  be  successful  at  predicting  the 
most  significant  events  in  both  data  sets.  Although  the  approach  varied  slightly  between 
the  two  data  sets,  we  have  been  able  to  show  that  statistical  process  control  is  effective  at 
identifying  organizational  change  in  these  two  social  groups.  For  the  TOEP  data  set, 
there  were  relatively  few  nodes  and  many  time  periods.  Therefore,  variance  effects  were 
much  stronger.  It  was  necessary  to  control  for  this  effect  by  constructing  a  statistical 
model and conducting statistical process control on the model residuals. With the al
Qaeda data set, however, we were able to conduct statistical process control directly on
the network measures, due to the greater stability of the network measures.

Being able to predict the closeness of the TOEPs’ communication network was
essential in explaining much of the variance in the network. The control chart could then
be used to determine when the network changed away from the model, that is, when the
model no longer provides a good prediction. Using the closeness model developed
from  data  obtained  during  the  first  semester  of  the  TOEP  graduate  program,  predicted 
values  were  calculated  for  each  week  of  the  second  semester  using  the  number  of  social 
gatherings  and  group  projects  from  the  TOEP  calendar  and  the  number  of  emails  sent  by 
observation.  These  were  compared  with  the  observed  network  measures.  The  residuals 
were  verified  as  normally  distributed  to  meet  the  prerequisites  of  the  CUSUM  Control 
Chart. The C+ and C- statistics were calculated for each week using a k value of 0.5 and a
control limit of 3. By running a Monte Carlo simulation with these settings, we were able
to predict that the CUSUM would have a false alarm rate of once out of every 59
observations, or roughly once a year for weekly observations. A graph of the CUSUM statistic is in Figure
4. 
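
The false alarm rate quoted above can be approximated with a short simulation. The Python sketch below is not the authors' code. It assumes independent, standardized, normally distributed in-control residuals and the standard two-sided tabular CUSUM recursion; the function names, the number of runs, and the seed are illustrative choices.

```python
import random


def run_length(k, h, rng, max_steps=100_000):
    """Count in-control N(0, 1) observations until a two-sided CUSUM signals."""
    c_plus = c_minus = 0.0
    for t in range(1, max_steps + 1):
        z = rng.gauss(0.0, 1.0)               # in-control standardized residual
        c_plus = max(0.0, z - k + c_plus)     # accumulates upward deviations
        c_minus = max(0.0, -z - k + c_minus)  # accumulates downward deviations
        if c_plus > h or c_minus > h:
            return t                          # false alarm at observation t
    return max_steps


def average_run_length(k, h, runs=4000, seed=1):
    """Monte Carlo estimate of the mean number of observations between false alarms."""
    rng = random.Random(seed)
    return sum(run_length(k, h, rng) for _ in range(runs)) / runs


arl_toep = average_run_length(k=0.5, h=3.0)
print(f"estimated in-control average run length (k=0.5, h=3): {arl_toep:.1f}")
```

Under these assumptions the estimate should fall in the neighborhood of the 59 observations reported above, up to Monte Carlo error. Raising h lengthens the expected time between false alarms at the cost of slower detection.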


Figure  4  Plot  of  closeness  CUSUM  statistic  for  nine  active  TOEPs 

Figure  4  indicates  that  the  control  chart  signals  on  Week  23  (see  Table  2).  Week 
23  was  the  week  that  the  TOEPs  took  the  comprehensive  exam  for  their  graduate 
program. It was the most significant academic event of the year. Tracing the C- statistic
back to the last time it was zero, the most likely change point was during Week 21. Upon
first  examination,  Week  21  looks  like  it  should  be  a  typical  academic  week,  with  no 
unusual  events  or  graded  projects.  However,  based  on  interviews  conducted  with  TOEPs 
after  the  signal  was  detected,  it  was  discovered  that  Week  21  was  a  critical  preparation 
week  prior  to  the  comprehensive  exam  when  the  study  questions  for  the  exam  were  sent 
to  the  students.  Thus,  the  CUSUM  control  chart  signals  on  Week  23  as  it  represents  a 
significant  departure  from  the  value  predicted  by  the  model. 


Table 2  CUSUM Statistic Values for Closeness Network Measure

Week   Closeness   Model     Z          C+       C-
15     0.3332      0.4712    -1.9714    0.0000   1.4714
16     0.5134      0.3798     1.9086    1.4086   0.0000
17     0.2760      0.3798    -1.4829    0.0000   0.9829
18     0.3332      0.3562    -0.3286    0.0000   0.8114
19     0.5406      0.5243     0.2329    0.0000   0.0786
20     0.6536      0.5745     1.1300    0.6300   0.0000
21     0.4977      0.3916     1.5157    1.6457   0.0000
22     0.1258      0.2913    -2.3643    0.0000   1.8643
23     0.2646      0.4215    -2.2414    0.0000   3.6057
24     0.5226      0.4152     1.5343    1.0343   1.5714

The  CUSUM  control  chart  implemented  on  the  residuals  of  a  communication 
model proved to be effective at detecting organizational change in the TOEP program. It
is also interesting to note that a decrease in communication can indicate that a major
event  is  about  to  occur,  as  the  subjects  rely  less  on  email  and  more  on  face-to-face 
communication  and  study  groups. 
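
The C+ and C- columns of Table 2 can be reproduced from the Z column with the standard tabular CUSUM recursion. The Python sketch below is illustrative rather than the authors' implementation: the function names are hypothetical, and it assumes both statistics start at zero before Week 15, which is consistent with the first row of the table.

```python
def tabular_cusum(z_scores, k=0.5):
    """Return a list of (C+, C-) pairs for a sequence of standardized residuals."""
    c_plus, c_minus = 0.0, 0.0
    rows = []
    for z in z_scores:
        c_plus = max(0.0, z - k + c_plus)     # one-sided chart for increases
        c_minus = max(0.0, -z - k + c_minus)  # one-sided chart for decreases
        rows.append((c_plus, c_minus))
    return rows


def first_signal(labels, rows, h):
    """Return (label of first signal, label where the signalling statistic was last zero)."""
    for i, (c_plus, c_minus) in enumerate(rows):
        if c_plus > h or c_minus > h:
            side = 0 if c_plus > h else 1
            j = i
            while j >= 0 and rows[j][side] > 0.0:
                j -= 1                        # walk back to the last zero
            return labels[i], (labels[j] if j >= 0 else None)
    return None, None


# Z values copied from Table 2 (Weeks 15 to 24).
weeks = list(range(15, 25))
z = [-1.9714, 1.9086, -1.4829, -0.3286, 0.2329,
     1.1300, 1.5157, -2.3643, -2.2414, 1.5343]

rows = tabular_cusum(z, k=0.5)
signal_week, change_week = first_signal(weeks, rows, h=3.0)
print(signal_week, change_week)
```

With k = 0.5 and a control limit of 3, the sketch signals on Week 23 through the C- statistic and traces the last zero of that statistic back to Week 21, matching the discussion above.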

The success of social network change detection on the TOEP data set warranted
further investigation. The al Qaeda data set offered data with more nodes, aggregated
over much longer time periods. At the same time, we were able to identify at
least one major event in al Qaeda’s history. The question was asked, “can we identify
September 11 from the social network?” Perhaps more importantly, “can we identify the
point in time when the organization changed into such a threatening menace?”

The reference value, k, and the control limit, h, were set at 0.5 and 4 respectively
for all of the social network control charts, for no reason other than that these are widely used
industry standards (McCulloh, 2004). This would correspond to a false alarm once every
168 years. Figure 5 shows the CUSUM statistic for the average closeness that is plotted
in Figure 3. It can be seen that the CUSUM statistic in Figure 5 is a more dramatic
indication of network change than simply monitoring the network measure in Figure 3.
This is a result of the CUSUM statistic taking into account previous observations and
deviations from the mean in the network measure. A single observation of a network
measure that is slightly higher than normal may not indicate a change in the network;
however, multiple observations that are slightly higher than normal may indicate a shift in
the mean of the measure.


Figure 5 Plot of Closeness CUSUM Statistic of al Qaeda

Recall that a single CUSUM chart will detect either increases or decreases in a measure, but
not both. Therefore, two control charts must be run for each social network measure
monitored. One chart is used to detect increases and the other to detect decreases. Table
3 displays the CUSUM statistic values for the closeness measure. The trends in the data for
the closeness measure are the same as those for the betweenness and density measures.

Table 3  CUSUM Statistic Values for Closeness Network Measure

Time   Closeness   Z           C+       C-
1994   0.0027       -0.8729    0.0000    0.3729
1995   0.0030        1.0911    0.5911    0.0000
1996   0.0028       -0.2182    0.0000    0.0000
1997   0.0028       -0.2182    0.0000    0.0000
1998   0.0031        1.7457    1.2457    0.0000
1999   0.0030        1.0911    1.8368    0.0000
2000   0.0032        2.4004    3.7372    0.0000
2001   0.0034        3.7097    6.9469    0.0000
2002   0.0024       -2.8368    3.6101    2.3368
2003   0.0015       -8.7287    0.0000   10.5655
2004   0.0004      -15.9300    0.0000   25.9955


It can be seen in Table 3 that the CUSUM statistic for increases rises from 3.7372 in 2000
to 6.9469 in 2001, exceeding the control limit of 4 and
signaling that there might be a significant change in the al Qaeda network between the
years 2000 and 2001. Therefore, an analyst monitoring al Qaeda would be alerted to a
critical, yet subtle change in the network prior to the September 11 terrorist attacks.

The CUSUM control chart also has a built-in feature for determining the most
likely  time  that  the  change  occurred.  This  time  is  identified  as  the  last  point  in  time  when 
the  CUSUM  statistic  is  equal  to  zero.  For  all  measures,  this  point  in  time  is  1997.  To 
understand  the  cause  of  the  change  in  the  al  Qaeda  network,  an  analyst  should  look  at  the 
events  occurring  in  al  Qaeda’s  internal  organization  and  external  operating  environment 
in  1997. 
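
The accumulation that leads to the 2001 signal can be traced by hand from the Z values in Table 3. The worked arithmetic below uses the standard tabular CUSUM recursion for increases, which reproduces the tabulated values; the symbol C^{+}_{t} for the statistic in year t is notation adopted here for illustration.

```latex
\begin{align*}
C^{+}_{t}    &= \max\bigl(0,\; Z_t - k + C^{+}_{t-1}\bigr), \qquad k = 0.5,\quad h = 4 \\
C^{+}_{1997} &= 0 \\
C^{+}_{1998} &= \max(0,\; 1.7457 - 0.5 + 0) = 1.2457 \\
C^{+}_{1999} &= \max(0,\; 1.0911 - 0.5 + 1.2457) = 1.8368 \\
C^{+}_{2000} &= \max(0,\; 2.4004 - 0.5 + 1.8368) = 3.7372 < h \\
C^{+}_{2001} &= \max(0,\; 3.7097 - 0.5 + 3.7372) = 6.9469 > h
\end{align*}
```

No single year from 1998 to 2000 would exceed the limit on its own; the signal comes from four consecutive positive deviations building on one another, and the last year in which the statistic was zero is 1997.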

Several  very  interesting  events  related  to  al  Qaeda  and  Islamic  extremism 
occurred  in  1997.  Six  Islamic  militants  massacred  58  foreign  tourists  and  at  least  four 
Egyptians  in  Luxor,  Egypt  (Jehl,  1997).  United  States  and  coalition  forces  deployed  to 
Egypt  in  1997  for  a  bi-annual  training  exercise  were  repeatedly  attacked  by  Islamic 
militants.  The  coalition  suffered  numerous  casualties  and  shortened  their  deployment.  In 
early  1998,  Zawahiri  and  Bin  Laden  were  publicly  reunited,  although  based  on  press 
release  timing,  they  must  have  been  working  throughout  1997  planning  future  terrorist 
operations.  In  February  of  1998,  an  Arab  newspaper  introduced  the  “International 
Islamic Front for Combating Crusaders and Jews.” This organization, established in
1997, was founded by Bin Laden, Zawahiri, leaders of the Egyptian Islamic Group, the
Jamiat-ul-Ulema-e-Pakistan,  and  the  Jihad  Movement  in  Bangladesh,  among  others.  The 
Front  condemned  the  sins  of  American  foreign  policy  and  called  on  every  Muslim  to 
comply  with  God’s  order  to  kill  the  Americans  and  plunder  their  money.  Six  months 
later  the  US  embassies  in  Tanzania  and  Kenya  were  bombed  by  al  Qaeda.  Thus,  1997 
was  possibly  the  most  critical  year  in  uniting  Islamic  militants  and  organizing  al  Qaeda 
for  offensive  terrorist  attacks  against  the  United  States. 

6.  CONCLUSION


Control charts are a critical quality-engineering tool that assists manufacturing
firms in maintaining profitability (Montgomery, 1991; Ryan, 2000). The TOEP and al
Qaeda examples demonstrate that social network monitoring could enable analysts to
detect important changes in the monitored communication of both command and control
networks as well as terrorist networks. Furthermore, the most likely time that the change
occurred can also be determined. This allows one to allocate minimal resources to
tracking the general patterns of a network and then shift to full resources when changes
are detected.¹

This paper describes an algorithm for change detection, and then demonstrates its
ability to detect changes in networks. No doubt other change detection methods will
emerge. Our point is that it is critical to be able to detect change in networks over time
and to determine when those changes are not simply the random fluctuations of chance.
The strengths of the proposed method are its statistical approach, its ability to quantify the
rate of false alarm, the wide range of social network metrics suitable for application, its
ability to identify change points in organizational behavior, and its flexibility for various
magnitudes of change. The proposed method is limited to normally distributed network
measures, and a period of dynamic equilibrium must be assumed to estimate parameters
of the control chart. Other limitations of the algorithm cannot yet be determined as this is
the first application of statistical process control methods to the problem of social
network change detection. Future research will provide much greater insight into the
strengths and limitations of this approach to the problem. The remainder of this section
will identify specific areas of caution when interpreting findings and identify areas for
future research.

The empirical results described in this paper, such as the detection of change in
the al Qaeda network, should be viewed with caution. We present them here purely to
illustrate the methodology. Limitations on the data make it difficult to determine the
validity of the results; thus, we should simply view these results as showing the promise
of this methodology. The IkeNet data is a small sample capturing only email traffic and
not all communication and interaction among participants. The fact that even in this
small sample of behavior we were able to systematically detect a key change suggests the
value of the proposed approach. The al Qaeda data was based on open source
information. As such it is an incomplete representation of interaction in that terror
network. We cannot be sure that we have the entire communication network, or even a
true picture of the observed communication network. However, the fact that our
technique detects a change corresponding with the 9/11 attacks is intriguing. This work
suggests that our approach may provide some ability to detect change even when there is
incomplete information.


¹ Two social network change detection algorithms (Shewhart X-Bar and the Cumulative Sum) are
available  in  the  “Statistical  Network  Monitoring  Report”  in  the  software  tool,  Organizational  Risk  Analyzer 
(ORA)  available  through  the  Center  for  Computational  Analysis  of  Social  and  Organizational  Systems 
(CASOS),  http://www.casos.cmu.edu. 

That being said, it is important that future work examine the errors associated
with this technique, both the false positives and false negatives. Future work should also
consider the sensitivity of this approach to missing information, and to the reason why
the information is missing. For example, data sets collected post-hoc that focus on
activity around an event, such as the al Qaeda data, are prone to missing nodes
and, as a result, missing links prior to the event, whereas data sets collected based on opportunity,
such as the IkeNet data, are prone to missing links among the nodes.

In order to rectify the above shortcomings, future research should focus on
near-complete datasets with high resolution. Higher resolution involves taking many
snapshots of the network. This may simply mean an increase in frequency, e.g., changes
by month, or it may mean a longer time horizon, e.g., more years. The right choice will
depend on the problem for which we want to detect network change. More data points will
provide more opportunities to detect changes while they are still small, instead of
allowing them to incubate and grow as was the case for the al Qaeda data. Larger
datasets will also provide near-continuous network measures, permitting the use of control
charts for continuous data. Near-complete data means that the data should cover the
communication network with little or no missing information for a large contiguous
period. Here one might consider simply tracking a group in general, as opposed to
focusing on tracking relative to a specific event. Data such as that on the US Congress or
Supreme Court that is regularly output might provide a good source of data.

Another  limitation  of  this  approach  is  that  it  assumes  that  network  measures  are 
normally  distributed.  Research  on  the  distributions  is  needed.  Preliminary  work  on  these 
distributions  suggests  that  the  assumption  of  normality  does  not  hold  for  small  networks, 
extremely  sparse  networks,  and  for  certain  metrics  (Kim  and  Carley,  working  paper). 
Future  work  should  consider  these  factors  to  determine  the  range  of  networks  for  which 
this approach will work. Clearly, if the network measures are normally distributed, the
CUSUM control chart can be used to monitor network change. If they are not, a different
control chart must be used or a new approach to the problem must be developed. Future work should
address this issue.

Future research should also look at the sensitivity of the optimality constant, k, and
control limit values of the CUSUM Control Chart for network measure change detection.
As stated earlier, these values are generally arbitrarily chosen and then optimized for the
process. By using further Monte Carlo simulations, a researcher could determine which
parameter values would be best for detecting certain types of changes, such as sudden large
changes or slow creeping shifts. The use of control charts for comparing models and
observations  should  also  be  studied  to  see  what  specific  conclusions  can  be  obtained. 
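
One way such a sensitivity study could begin is sketched below in Python. This is an illustration only, not a result from this paper: it assumes standardized, independent, normally distributed observations, a mean shift that is present from the first monitored observation, and CUSUM statistics that start at zero. The function names, shift sizes, number of runs, and seed are hypothetical choices.

```python
import random


def detection_delay(shift, k, h, rng, max_steps=10_000):
    """Observations needed for a two-sided CUSUM to signal after a mean shift."""
    c_plus = c_minus = 0.0
    for t in range(1, max_steps + 1):
        z = rng.gauss(shift, 1.0)             # standardized observation after the shift
        c_plus = max(0.0, z - k + c_plus)
        c_minus = max(0.0, -z - k + c_minus)
        if c_plus > h or c_minus > h:
            return t
    return max_steps


def mean_delay(shift, k=0.5, h=4.0, runs=2000, seed=7):
    """Monte Carlo estimate of the average time to signal for a given shift size."""
    rng = random.Random(seed)
    return sum(detection_delay(shift, k, h, rng) for _ in range(runs)) / runs


# Shift sizes are expressed in standard deviations of the monitored measure.
delays = {shift: mean_delay(shift) for shift in (0.5, 1.0, 2.0)}
for shift, delay in delays.items():
    print(f"shift of {shift} sigma: mean time to signal {delay:.1f} observations")
```

Repeating the loop over a grid of k and h values, and pairing each setting with its in-control false alarm rate, would give the kind of trade-off table the paragraph above calls for.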

Multi-agent simulations would also provide valuable insight into the performance
of control charts for social network change detection applications. Simulations would
allow an investigator to introduce various changes into a simulated organization and
evaluate the time to detect for different algorithms. Simulations provide an efficient
means of evaluating change detection on social networks. More important, however, is
the ability to create more controlled experiments by fixing certain variables, exploring
others, and using many replications to estimate error. Simulation studies will be
extremely useful in exploring extensions of this methodology.

Social  network  change  detection  is  important  for  identifying  significant  shifts  in 
organizational  behavior.  This  provides  insight  into  policy  decisions  that  drive  the 
underlying  change.  It  also  shows  the  promise  of  enabling  predictive  analysis  for  social 
networks  and  providing  early  warning  of  potential  problems.  In  the  same  way  that 
manufacturing  firms  save  millions  of  dollars  each  year  by  quickly  responding  to  changes 
in  their  manufacturing  process,  social  network  change  detection  can  allow  senior  leaders 
and  military  analysts  to  quickly  respond  to  changes  in  the  organizational  behavior  of  the 
socially  connected  groups  they  observe.  The  combination  of  statistical  process  control 
and  social  network  analysis  is  likely  to  produce  significant  insight  into  organizational 
behavior and social dynamics. Immediate applications to counterterrorism are obvious.
As  a  scientific  community  we  can  hope  to  see  more  research  in  this  area  as  network 
statistics  continue  to  improve. 